Modeling speech imitation and ecological learning of auditory-motor maps

Authors

  • Claudia Canevari
  • Leonardo Badino
  • Alessandro D'Ausilio
  • Luciano Fadiga
  • Giorgio Metta
Abstract

Classical models of speech assume an antero-posterior distinction between perceptive and productive functions. However, the selective alteration of neural activity in speech motor centers, via transcranial magnetic stimulation, has been shown to affect speech discrimination. Automatic speech recognition (ASR) systems, for their part, have classically relied solely on acoustic data, achieving rather good performance in optimal listening conditions. The limitations of current ASR become evident mainly in realistic use of such systems. These limitations can be partly reduced by normalization strategies that minimize inter-speaker variability, either by explicitly removing speakers' peculiarities or by adapting different speakers to a reference model. In this paper we aim at modeling a motor-based imitation learning mechanism in ASR. We tested the utility of a speaker normalization strategy that uses motor representations of speech and compared it with strategies that ignore the motor domain. Specifically, we first trained a regressor through state-of-the-art machine learning techniques to build an auditory-motor mapping, in a sense mimicking a human learner who tries to reproduce utterances produced by other speakers. This auditory-motor mapping maps the speech acoustics of a speaker into the motor plans of a reference speaker. Since only speech acoustics are available during recognition, the mapping is necessary to "recover" motor information. Subsequently, in a phone classification task, we tested the system either on one of the speakers used during training or on a new one. Results show that in both cases the motor-based speaker normalization strategy slightly but significantly outperforms all other strategies in which only acoustics is taken into account.

Related resources

Learning to associate speech-like sensory and motor states during babbling

Background: Development of the feedback loop of speech production starts during the babbling phase of speech acquisition. Within the first year of life, toddlers acquire the ability to imitate auditory stimuli, i.e. the ability to associate speech-like sensory and motor states. Method: Self-organizing maps and one-layer feed-forward networks were used for modeling this learn...

How Children Learn to Pronounce: Not by Imitation but by Their Mothers’ Vocal Mirroring

It is generally assumed that children learn to pronounce speech sounds by imitation from adult models. This requires that a child creates some form of representation for a speech sound in a single modality, which he uses for both perception and production. It is usually imagined that this underlying representation is auditory/acoustic, but arguments can also be made for motor/gestural alternati...

A computational model of perceptuo-motor processing in speech perception: learning to imitate and categorize synthetic CV syllables

This paper presents COSMO, a Bayesian computational model that is expressive enough to carry out syllable production, perception and imitation tasks using motor, auditory or perceptuo-motor information. An imitation algorithm makes it possible to learn the articulatory-to-acoustic mapping and the link between syllables and corresponding articulatory gestures from acoustic inputs only: synthetic CV syl...

Neural mechanisms for learned birdsong.

Learning by imitation is essential for transmitting many aspects of human culture, including speech, language, art, and music. How the human brain enables imitation remains a mystery, but the underlying neural mechanisms must harness sensory feedback to adaptively modify performance in reference to the object of imitation. Although examples of imitative learning in nonhuman animals are relative...

Converging toward a common speech code: imitative and perceptuo-motor recalibration processes in speech production

Auditory and somatosensory systems play a key role in speech motor control. In the act of speaking, segmental speech movements are programmed to reach phonemic sensory goals, which in turn are used to estimate actual sensory feedback in order to further control production. The adult's tendency to automatically imitate a number of acoustic-phonetic characteristics in another speaker's speech how...

Volume 4

Publication date: 2013